Reaction Pathways Based on the Gradient of the Mean First-Passage Time 
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Finding representative reaction pathways is necessary for understanding mechanisms of molecular 
processes, but is considered to be extremely challenging. We propose a new method to construct 
reaction paths based on mean first-passage times. This approach incorporates information of all 
possible reaction events as well as the effect of temperature. The method is applied to exemplary 
reactions in a continuous and in a discrete setting. The suggested approach holds great promise for 
large reaction networks that are completely characterized by the method through a pathway graph. 



Introduction. In many physical, chemical, or bio- 
logical reactions, initial (reactant) and final (product) 
states are known but reaction pathways are not. Ex- 
amples range in complexity from single-particle Brown- 
ian motion to conformational changes of proteins, such 
as protein folding jj], ||. Finding reaction pathways is 
one of the most fundamental challenges in chemistry and 
molecular biology^, ||. It is important for understanding 
reaction mechanisms and for the calculation of reaction 
rates ||. 

In most cases of interest, reactions take place at fi- 
nite temperature and therefore are stochastic: every re- 
action event follows a different path and takes a different 
amount of time. Among all possible paths from reac- 
tant to product, one seeks the path that best character- 
izes the reaction The selection criteria are twofold. 
First, one seeks minimal paths, free of unnecessary fluc- 
tuations. Second, and more important, one seeks paths 
that are representative of all reaction events so that indi- 
vidual reaction events can be considered to be following 
noisy paths around them. It is, however, challenging to 
formulate these criteria rigorously. 

The most widely used formulation of reaction path is 
probably the steepest-descent path, which is constructed 
by identifying saddle points of the potential energy sur- 
face and then following the steepest descent from the 
saddle points such that energy barriers along the path 
are minimized ||. The steepest-descent path, however, 
does not involve temperature. Since reactions are driven 
by thermal fluctuations, reaction paths should depend 
on temperature. For example, if there is a direct path 
with high energy barriers and a roundabout path with 
low energy barriers, at a temperature higher than the 
barriers reaction will occur most likely along the direct 
path rather than the roundabout path while the steepest- 
descent path will be the roundabout path regardless of 
temperature. 

Understanding this drawback of the steepest-descent 
path has led to alternative formulations of reaction path. 
One approach is to select the path of maximum flux @, B. 
Another formulation focuses on most probable paths B, 
|To| |. In this approach, an ensemble of reaction events of 
a fixed time interval is considered. A probability is then 
assigned to each event, and the path followed by the most 



FIG. 1: Schematic pictures of reaction paths. S is the set of 
all accessible states, 1Z the reactant, and V the product, (a) A 
continuous system. Dashed lines are contours of the reaction 
coordinate (the MFPT to the product V). In a Cartesian 
coordinate system, they are orthogonal to the reaction paths, 
(b) A discrete system. Dots are the accessible states. 



probable event is taken as the reaction path |Tl[ . 

The above methods succeeded to a certain extent in 
elucidating reaction mechanisms, but they do not fully 
satisfy the criterion of representativity. The method of 
most probable path comes closest to satisfying the crite- 
rion, but it is not clear how to choose the time interval 
beforehand and whether an ensemble of reaction events 
of a single time interval suffices to represent the reaction. 

In this paper we present a new formulation of reaction 
path. While previous approaches attempted to quantify 
paths, we use the concept of reaction coordinate which 
quantifies states. Reaction coordinate is a function that 
describes where in the progress of reaction a state is lo- 
cated. The most natural measure of the progress of reac- 
tion is provided by the mean first-passage time (MFPT) , 
namely the average amount of time that the system start- 
ing from the state takes to reach the given product. The 
MFPT depends on the energy landscape, the tempera- 
ture, as well as the boundary conditions, and most im- 
portant, it is an average over all reaction events. Surpris- 
ingly, to our knowledge, the MFPT has never been used 
as a reaction coordinate before. Once the MFPT is deter- 
mined, one can choose as reaction paths the paths along 
which the MFPT decreases most rapidly, which complies 
with the criterion of minimal path. 

Theory. A practical scheme of determining reaction 
coordinates and reaction paths based on MFPTs can 
indeed be stated, as described below and illustrated in 
Fig. |l]. Consider a system undergoing a stochastic reac- 
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tion and let S be the set of all states that the system can 
access. If S is continuous the reaction can be described 
by a Fokker-Planck-type equation, and if S is discrete it 
can be described by a master equation Jl2| . The reactant 
and the product are specified by disjoint subsets, 1Z and 
V respectively, of S. The reaction can be considered a 
first-passage process Q because a reaction event ends 
as soon as the system reaches any state in V . The MFPT 
t(x) from state x to the product V is then calculated for 
all states x in S (this involves solving an inhomogeneous 
partial differential equation when S is continuous and 
inverting a transition matrix when discrete |l4|, as 
demonstrated later) and is used as a reaction coordinate. 
The location of the reactant set 1Z does not affect the 
calculation of the reaction coordinate r(x); it is involved 
only in determining reaction paths. 

The scheme of constructing reaction paths from MF- 
PTs depends on the character of the set S. When S 
is continuous and described by a Cartesian coordinate 
system, reaction paths are constructed following the di- 
rection of — Vr, along which t decreases most rapidly. 
Thus, a reaction path x(Z), parameterized through an 
arc length I, satisfies 



dxi 



-1/2 



dr 



(1) 



Often, reactions are better described with non-Cartesian 
coordinates such as angles. In such cases reaction paths 
can be determined via a transformation to a Cartesian 
coordinate system, and the resulting equation of reaction 
path is 
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Here g~j 1 is the inverse matrix of the metric tensor g^ and 
I is now the arc length with respect to the non-Cartesian 
coordinate system (dl 2 =Xh dxidxi). 

When S is discrete and the reaction is governed by a 
master equation with transition rates k yx (from state x 
to state y), the MFPT t x from state x to the product 
V is again employed as a reaction coordinate. But in 
order to determine reaction paths an analogue of met- 
ric is required, as reaction paths for continuous systems 
are determined via gradients which involve metric. The 
most obvious choice for an analogue of metric is the tran- 
sition rates k yx themselves, and we suggest the scheme 
that a reaction path going through state x chooses the 
next state y such that k yx (T x — T y ) is maximized. For 
the transition step from x to y, the transition time l/k yx 
may be interpreted as a cost, and the decrease t x — t v 
in the MFPT as a gain. The scheme then amounts to 
maximizing the ratio between these two times, namely 
the gain-cost ratio, for each step. 

According to the above scheme, a reaction path is con- 
structed starting from each state in the reactant set 1Z. In 




FIG. 2: A contour plot of the three-hole potential, with two 
candidates for reaction path. 



general, multiple reaction paths are obtained unless the 
reactant set is narrowed down to a single state. Some 
reaction paths may overlap if they go through a common 
state. 

Examples. To demonstrate the above scheme and 
its outcomes, we consider first a single-particle Brownian 
motion on a two-dimensional potential surface described 
with a Cartesian coordinate system (x\, x 2 )- We take the 
three-hole potential 

U(x 1 ,x 2 ) = _ 3e -M^- 5 /3) 2 +3e -*?-(z2-i/3) 2 

-5e-^-^ 2 -^ - 5e-( Xl+ V 2 - x * (3) 

which was also studied by others regarding the tempera- 
ture dependence of reaction paths @, |o) . The potential 
features two deep holes and one shallow hole (Fig. ||). 
The two deep holes are considered as the reactant and 
the product. Roughly two possible pathways can be seen; 
the upper path is longer than the lower one but has lower 
energy barriers. It is therefore expected that the upper 
path will be taken at low temperature and that the lower 
path will be taken at high temperature. 

The Brownian motion can be described in terms of 
the probability distribution p(x, t) and the probability 
current J(x, t). In the strong friction regime, they satisfy 
the Smoluchowski equation Q 



-V- J(x, t) 

J(x, t) = -(/3 7 )- 1 e- /3a(x) V[e^( x )p(x, *)] , (4) 



where 7 is the friction coefficient. The MFPT r(x) is 
then obtained by solving the inhomogeneous partial dif- 
ferential equation 



V • [e-^WVr(x)] = -p^^ 



(5) 



with appropriate boundary conditions H] . We take the 
region (—4 < x\ < 4, —3 < x 2 < 4) as the whole set S, 
and assume that its boundary is reflecting, namely the 
probability current J is tangential to the boundary. For 
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FIG. 3: Reaction paths of the Brownian motion on the three- 
hole potential. Top, = 1; bottom, ,3 = 8. The directions of 
— Vr at selected grid points are plotted as arrows. Contours 
of the reaction coordinate r (thick lines) and the potential 
(thin lines) are shown. The product region is indicated by 
the filled circle. 



the reactant 1Z we take the single point (—1,0), and for 
the product V we take the circular region of radius 0.1 
centered at (1,0). Because a reaction event ends when 
the particle reaches the product region V , the bound- 
ary of V is absorbing, with the probability distribution 
p vanishing at the boundary. Boundary conditions for p 
and J lead to corresponding boundary conditions for t: 
t vanishes at absorbing boundaries and Vr is tangential 
to reflecting boundaries flljj . 

The solutions of Eq. rtq), numerically obtained with 
MATLAB [16], for two different temperatures are shown 
in Fig. ||. The differences between the two temperatures 
are dramatic. At the high temperature (f3 — 1) the arrows 
denoting the directions of — Vr flow more or less directly 
towards the product. At the low temperature (f3 — 8), on 
the other hand, the flow is significantly distorted so that 
energy barriers are avoided, with a singular point pro- 
duced around (—1.5, —0.5). Also, the reaction coordinate 
t drops rapidly when barriers are crossed, as indicated 
by the contours of r packed around saddle points. 

Fig. |I] shows the reaction paths found at various tem- 
peratures. As was expected, lower paths are taken at 




FIG. 4: Temperature dependence of reaction paths of the 
Brownian motion on the three-hole potential. Shown are eight 
reaction paths for eight different temperatures, from bottom 
to top, /3=1,2, 3, 4, 5, 6, 7, 8. The reactant is the point ( — 1, 0), 
and the product is the region indicated by the filled circle. 



high temperature and upper paths are taken at low tem- 
perature. At intermediate temperature, such as /3 = 4, 
reaction paths lie in between, indicating that the upper 
and lower paths are equally favorable. 

In order to demonstrate the scheme for reaction paths 
of discrete systems, we consider light-harvesting com- 
plexes [|l7| [l8|, [n]] , choosing photosystem I of cyanobac- 
terium Synechococcus elongatus as an example. Photo- 
system I is a protein-pigment complex, embedded in the 
cell membrane, that contains 96 chlorophylls. The aggre- 
gate of these chlorophylls is responsible for the first step 
in photosynthesis, namely absorption of light and sub- 
sequent migration of the resulting electronic excitation 
towards the special pair of chlorophylls, called P700 and 
located at the geometrical center of the aggregate, where 
the next step in photosynthesis, the charge separation, 
occurs. This excitation migration can be considered a 
reaction. Assuming that no more than a single chloro- 
phyll is simultaneously excited in the system, the states 
are specified by the excited chlorophylls. The reactant is 
specified by the chlorophyll that initially absorbed a pho- 
ton; the product is specified the P700 pair of chlorophylls. 
Reaction paths of this system provide representative and 
minimal pathways of the excitation migration from the 
initially excited chlorophyll to the P700 pair. 

Since we are interested in first passage of excitation 
to P700, it is convenient to consider a subsystem of 
94 chlorophylls, excluding the P700 pair of chlorophylls. 
The migration of excitation in this subsystem can be de- 
scribed by the master equation 



dt 



Px(t) = V] K XyP y(t) , 



(6) 



where p x (t) is the probability that the excitation resides 
at chlorophyll x at time t and K xy is a 94 x 94 transition 
matrix [^0| plL We build the transition matrix K xy by 
using the inter-chlorophyll excitation transfer rates that 
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FIG. 5: Reaction paths of the excitation migration in pho- 
tosystem I. The positions of the chlorophylls, projected onto 
the membrane plane, are denoted by circles colored according 
to the MFPT: black, short MFPT; gray, intermediate MFPT; 
white, long MFPT. 



were calculated in Ref. |22j based on the theory developed 
in and the recently obtained structure of photosys- 
tem I [||. The MFPT t x from chlorophyll x to the P700 
pair is then pffl 



K K yx 



-1 
y- Ly yx 



(7) 



Here K X y is the inverse matrix of K xy 



and £ x is the 

excitation transfer rate from chlorophyll x to P700. The 
reaction paths constructed from the MFPT are shown in 
Fig. H. A detailed discussion of this system is reported 
in g. 

Conclusion. We have presented a new method to 
construct reaction paths based on the MFPT gradient 
that incorporates all reaction events, and illustrated how 
it captures important aspects of reactions, most no- 
tably temperature effects. Unlike previous attempts, the 
present method provides paths starting from all states 
of the system. We believe that the MFPT is the most 
natural choice for reaction coordinate, and expect that 
our approach will be used in many future studies of re- 
actions. The generality of MFPT permits applications 
of the present method to a wide range of phenomena. 
The present method is particularly suitable for large re- 
action networks that are completely characterized by the 
method through a pathway graph that connect each state 
to the product. 

We thank Paul Grayson and Melih K. §ener for useful 
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